Word statistics in Blogs and RSS feeds: Towards empirical universal evidence

Authors

  • Renaud Lambiotte
  • Marcel Ausloos
  • Mike Thelwall
Abstract

We focus on the statistics of word occurrences and of the waiting times between such occurrences in blogs. Because word frequencies are heterogeneous, the empirical analysis is performed on classes of "frequently-equivalent" words, i.e. words grouped according to their frequencies. Two limiting cases are considered: the dilute limit, for words that are used less than once a day, and the dense limit, for frequent words. In both cases, extreme events occur more frequently than expected under the Poisson hypothesis. These deviations from Poisson statistics reveal non-trivial time correlations between events, which are associated with bursts of activity. The distribution of waiting times is shown to behave like a stretched exponential and to have the same shape for different sets of words sharing a common frequency, thereby revealing universal features.
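The comparison with the Poisson hypothesis can be illustrated with a small sketch. This is not the authors' procedure: the function names are hypothetical, the data are synthetic, and a Weibull law (whose survival function has the stretched-exponential form exp(-(t/tau)^beta)) is assumed as a stand-in for bursty waiting times. For a Poisson process the waiting times are exponential and their coefficient of variation is 1; a value clearly above 1 signals that long gaps and bursts are more common than Poisson statistics would predict.

```python
import math
import random


def waiting_times(timestamps):
    """Gaps between consecutive occurrences of a word (timestamps in any unit)."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def coefficient_of_variation(values):
    """Standard deviation divided by mean; equals 1 for exponential waiting times."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(var) / mean


def survival_stretched(t, tau, beta):
    """Stretched-exponential survival function; beta = 1 recovers the Poisson case."""
    return math.exp(-((t / tau) ** beta))


def empirical_survival(values, t):
    """Fraction of observed waiting times strictly longer than t."""
    return sum(1 for v in values if v > t) / len(values)


def sample_poisson_waits(n, rate, rng):
    """Synthetic waiting times of a Poisson process with the given rate."""
    return [rng.expovariate(rate) for _ in range(n)]


def sample_stretched_waits(n, tau, beta, rng):
    """Synthetic waiting times from a Weibull law (assumed model, beta < 1 is bursty)."""
    return [rng.weibullvariate(tau, beta) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    poisson = sample_poisson_waits(20000, 1.0, rng)
    bursty = sample_stretched_waits(20000, 1.0, 0.5, rng)
    print("CV, Poisson waits:  ", coefficient_of_variation(poisson))
    print("CV, stretched waits:", coefficient_of_variation(bursty))
    # Compare the empirical tail with the assumed stretched-exponential form.
    for t in (1.0, 4.0, 9.0):
        print(t, empirical_survival(bursty, t), survival_stretched(t, 1.0, 0.5))
```

With real data, `waiting_times` would be applied to the occurrence times of each word in a frequency class, and the resulting empirical survival curves compared across classes.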


Similar resources

Cobra: Content-based Filtering and Aggregation of Blogs and RSS Feeds

Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously-updated discussions. The problem is that finding and tracki...


Blogs Search Engine Using RSS Syndication and Fuzzy Parameters

The rapid development of the internet eventually increases the number of internet users triggering the need for an intelligent search engine that is able to minimize the search on world wide web (WWW) and find relevant information as requested. To overcome the issue of finding relevant information as well as minimizing the search on WWW, this paper proposes a search engine that is specifically ...


Revealing Student Blogging Activities Using RSS Feeds and LMS Logs

Blogs are an easy-to-use, free alternative to classic means of computer-mediated communication. Moreover, they are authentically aligned with web activity patterns of today’s students. The body of studies on integrating and implementing blogs in various educational settings has grown rapidly recently; however, it is often difficult to distill practical advice from these studies since the applic...


RSS Feed Recommendation

Really Simple Syndication (RSS) feeds allow users to access blogs and articles in an easy-to-read format. They cut out the overhead of navigating websites for content and allow users to get information more quickly. Currently, the user is in total control of their RSS feeds, adding and deleting feeds according to their tastes. This requires the user to actively search out RSS feed...


Foafing the Music: Bridging the Semantic Gap in Music Recommendation

In this paper we give an overview of the Foafing the Music system. The system uses the Friend of a Friend (FOAF) and RDF Site Summary (RSS) vocabularies for recommending music to a user, depending on the user’s musical tastes and listening habits. Music information (new album releases, podcast sessions, audio from MP3 blogs, related artists’ news and upcoming gigs) is gathered from thousands of...



Journal:
  • J. Informetrics

Volume 1


Publication date: 2007